Models and applications of Optimal Transport 
in Economics, Traffic and Urban Planning 



Filippo Santambrogio* 

Grenoble, June 2009 
Revised version : March 23rd, 2010 



Abstract 

These notes present some optimization and equilibrium problems that involve the concept of optimal
transport, mainly in applications to economics and game theory. The first topic is a variant of the
transport problem that takes traffic congestion effects into account; it shows various links with
Monge-Kantorovich theory and PDEs. Then, two models for urban planning are introduced. The last
section is devoted to two problems from economics and their translation into the language of optimal
transport.
Introduction



These lecture notes present the main issues and ideas of some variational problems that use or
touch the theory of Optimal Transportation. They all come from economics-oriented applications.
Problems will be presented through the main ideas, with almost no proofs. As I did during the
class in Grenoble and in the other lecture notes, I will try to keep the presentation very informal.

The first topic I will present is a variant of the usual Monge problem, which takes into account
congestion effects in the transportation. It has a more dynamical flavor, since it looks at the
trajectories followed by the particles instead of the simple pairs $(x, T(x))$. The starting point will be the
equivalent formulation by Beckmann, modified in order to take into account non-uniform metrics
and then traffic intensity. In this way one obtains a model which is linked to game theory and
Nash equilibria and has a well-known discrete counterpart on networks. Moreover, a classical
transport problem appears in the optimality conditions of the minimization (for a metric which is not
known a priori).

After this first section, two models for the distribution of some fundamental elements of the
structure of urban regions (residents, jobs, industrial areas, services...) are discussed. The first
is purely variational: we suppose that these distributions optimize a total welfare functional, as if
a benevolent planner could control the city; a transport cost (say, a Wasserstein
distance) appears explicitly in the functional. The second deals with much more delicate equilibrium
issues on the agents' choices and does not explicitly address any transport problem in its formulation; yet,
the theory of Monge-Kantorovitch is very useful in its resolution. In both cases the Kantorovitch
potentials play an essential role.

By the end of this third section, the reader who is not familiar with economic theories should
have become more accustomed to some typical concepts of the rational behavior of consumers,
and will be ready for Section 4. This section presents two classical problems in economics (in
situations of competition and of monopoly, respectively) and shows how to translate them into
problems involving transport costs and Kantorovitch potentials.

1 Traffic congestion 

1.1 Generalizations of Beckmann's Problem 

We saw in the introductory lecture notes [26] the problem (B):

\[ \text{(B)} \qquad \min \left\{ M(\lambda) \;:\; \nabla\cdot\lambda = \mu - \nu \right\}, \]

where $M(\lambda)$ denotes the mass of the vector measure $\lambda$, and we said that it is equivalent to the original
problem of Monge. Actually, one way to produce a solution to this divergence-constrained problem
is the following: take an optimal transport plan $\gamma$ and build a vector measure $v_\gamma$ defined through

\[ \langle v_\gamma, \phi \rangle := \int_{\Omega\times\Omega} \int_0^1 \omega'_{x,y}(t)\cdot\phi(\omega_{x,y}(t))\,dt\,d\gamma \]

for every $\phi \in C^0(\Omega;\mathbb{R}^d)$, $\omega_{x,y}$ being a parametrization of the segment $[x,y]$.

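It is not difficult to check that this measure satisfies the divergence constraint, since if one takes
$\phi = \nabla\psi$ then

\[ \int_0^1 \omega'_{x,y}(t)\cdot\nabla\psi(\omega_{x,y}(t))\,dt = \int_0^1 \frac{d}{dt}\big(\psi(\omega_{x,y}(t))\big)\,dt = \psi(y)-\psi(x) \]

and hence $\langle v_\gamma, \nabla\psi \rangle = \int \psi\, d(\nu - \mu)$.

To estimate its mass we can see that $|v_\gamma| \leq \sigma_\gamma$, where the scalar measure $\sigma_\gamma$ is defined through

\[ \langle \sigma_\gamma, \phi \rangle := \int_{\Omega\times\Omega}\int_0^1 |\omega'_{x,y}(t)|\,\phi(\omega_{x,y}(t))\,dt\,d\gamma, \qquad \forall \phi\in C^0(\Omega;\mathbb{R}), \]

and it is called transport density. The mass of $\sigma_\gamma$ is obviously

\[ \int_\Omega d\sigma_\gamma = \int_{\Omega\times\Omega}\int_0^1 |\omega'_{x,y}(t)|\,dt\,d\gamma = \int_{\Omega\times\Omega} |x-y|\,d\gamma = W_1(\mu,\nu), \]

which proves the optimality of $v_\gamma$.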

It is interesting to investigate whether $\sigma_\gamma \ll \mathcal{L}^d$, since this would imply that Problem (B) is
well-posed in $L^1$ instead of the space of vector measures. For the sake of the variants that we will
see later on, it would be interesting to give conditions so that $\sigma_\gamma \in L^p$ as well. All these subjects
have been widely studied by De Pascale and Pratelli (see [16], [17], [18]), but there is a more recent (and
shorter) proof of the same estimates in [25]. In particular, $\mu, \nu \in L^p$ implies
$\sigma_\gamma \in L^p$, and it is sufficient that one of the two measures is absolutely continuous in order
for $\sigma_\gamma$ to be absolutely continuous as well.

The simplest possible generalization of Problem (B) is the following:

\[ \min \left\{ \int k(x)\,|v(x)|\,dx \;:\; \nabla\cdot v = \mu - \nu \right\}, \]

which corresponds, by duality with the functions $u$ such that $|\nabla u| \leq k$, to

\[ \min \left\{ \int d_k(x,y)\,d\gamma \;:\; \gamma\in\Pi(\mu,\nu) \right\}, \]

where $d_k(x,y) = \inf_{\omega(0)=x,\ \omega(1)=y} L_k(\omega)$, with $L_k(\omega) := \int_0^1 k(\omega(t))\,|\omega'(t)|\,dt$, is the distance
associated to the Riemannian metric $k$. In this case it would be possible to build an optimal $v_\gamma$ by
replacing the segments $\omega_{x,y}$ with the $k$-geodesics.

The generalization above models a non-uniform cost for the movement
(due to geographical obstacles or configurations). It can be applied to several situations, but
one should clearly look for more realistic models, at least in the case of urban transport.
In this case the metric $k$ is usually not known a priori, but depends on the traffic distribution
itself.
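
As a rough illustration of the distance $d_k$, the following Python sketch approximates it on a regular grid with Dijkstra's algorithm. The grid, the 4-neighbour connectivity, the trapezoidal approximation of the integral of $k|\omega'|$ on each edge and all names (`grid_distance`, `flat`, `bumpy`) are assumptions made for this example and do not come from the notes. Paths are restricted to axis-parallel moves, so the result overestimates the true geodesic distance.

```python
import heapq


def grid_distance(k, source, h=1.0):
    """Approximate d_k(source, .) on a grid: k is a list of rows of positive
    costs, h is the mesh size, moves are allowed to the 4 nearest neighbours."""
    rows, cols = len(k), len(k[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    dist[source[0]][source[1]] = 0.0
    heap = [(0.0, source[0], source[1])]
    while heap:
        d, i, j = heapq.heappop(heap)
        if d > dist[i][j]:
            continue  # stale heap entry
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            a, b = i + di, j + dj
            if 0 <= a < rows and 0 <= b < cols:
                # trapezoidal rule for the integral of k along the edge
                nd = d + h * 0.5 * (k[i][j] + k[a][b])
                if nd < dist[a][b]:
                    dist[a][b] = nd
                    heapq.heappush(heap, (nd, a, b))
    return dist


# Uniform metric k = 1 on a 5 x 5 grid.
flat = [[1.0] * 5 for _ in range(5)]
d_flat = grid_distance(flat, (0, 0))

# Same grid with an expensive zone: column 2 costs 10, except on the last row.
bumpy = [row[:] for row in flat]
for i in range(4):
    bumpy[i][2] = 10.0
d_bumpy = grid_distance(bumpy, (0, 0))

print(d_flat[0][4], d_bumpy[0][4])
```

With $k \equiv 1$ the sketch returns grid (Manhattan) lengths; with the expensive zone the cheapest path from one corner of the first row to the other makes a detour through the last row, which is the kind of effect the metric $k$ is meant to model.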

The simplest model considers a metric $k(x) = g(|v(x)|)$ depending, through an
increasing function $g$, on the traffic itself (represented by the intensity of $v$). In this case a very naive
model would be obtained by setting $H(t) = t\,g(t)$ and then solving

\[ \min \left\{ \int H(|v(x)|)\,dx \;:\; \nabla\cdot v = \mu - \nu \right\}. \]



In most cases $H$ is strictly convex, and this is a strictly convex counterpart to the problem by
Beckmann (which was somehow suggested by Beckmann himself in his book [3]). Notice that this
model is not completely realistic either, since it allows for "cancellation" effects: several flows in
opposite directions at the same point $x$ may give a total vector $v(x) = 0$, even if the number of
travellers at $x$ is high. Yet, in Section 1.3 we will see that this simplified model turns out to be
equivalent to a more precise one.

We just mention that there exist concave variants too, which are known under the name of
branched transport. This name is used for all the transport problems where the cost
for a mass $m$ moving over a distance $\ell$ is proportional to $\ell$ but subadditive w.r.t. $m$. Typically, it
is proportional to a power $m^\alpha$ ($0 < \alpha < 1$). The adjective "branched" refers to
one of the main features of the optimal solutions: they gather mass together, so that masses tend to move
jointly as long as possible and then branch towards different destinations, thus giving rise to
a tree-shaped structure.

This problem comes from a discrete problem on graphs, where the cost of a graph $G$ whose edges
$e_h$ are weighted with coefficients $w_h$ is of the form $\sum_h w_h^\alpha\,\mathcal{H}^1(e_h)$. It has a continuous generalization
where the energy to be minimized is

\[ M^\alpha(v) = \begin{cases} \int_M \theta^\alpha\, d\mathcal{H}^1 & \text{if } v = U(M,\theta,\xi), \\ +\infty & \text{otherwise,} \end{cases} \]

where $v = U(M,\theta,\xi)$ means that $v$ is a rectifiable measure supported on the set $M$, with orientation
$\xi$ and density (multiplicity) $\theta$. The energy $M^\alpha$ is then minimized under the constraint $\nabla\cdot v = \mu - \nu$.

These notes will not develop this alternative problem any further; the reader may find the
whole theory of branched transport in the recent book by Bernot, Morel and Caselles ([4]).

1.2 Wardrop equilibria, the discrete case 

We will describe in this section a traffic problem which has some interesting issues on equilibria and 
some interesting relations with optimal transport theory. We will start from the discrete case on 
networks and then generalize to the continuous case. The network case was introduced in [27] and 
then studied in [2]. 

In the discrete framework, one considers

• a finite graph with edges $e \in E$ and a set of sources $S$ and destinations $D$,

• the set $C(s,d) = \{\omega \text{ from } s \text{ to } d\}$ of possible paths from $s$ to $d$,

• a demand input $\gamma = (\gamma(s,d))_{s,d}$ denoting the quantity of commuters from each $s \in S$ to each
$d \in D$, or a set $\Gamma$ of possible $\gamma$ (for instance this could be the set of all demands where the
total number of commuters leaving each point $s$ and the total number arriving at each point
$d$ are prescribed, but the coupling, i.e. how many commuters for each pair $(s,d)$, is not),

• an unknown repartition strategy (to be looked for) $q = (q_\omega)_\omega$ such that $\sum_{\omega\in C(s,d)} q_\omega = \gamma(s,d)$,

• a consequent traffic intensity on each edge $e$ (depending on $q$), $i_q = (i_q(e))_e$, given by $i_q(e) = \sum_{\omega \ni e} q_\omega$,

• an increasing function $g : \mathbb{R}_+ \to \mathbb{R}_+$ such that $g(i_q(e))$ represents the congested cost of $e$,

• the cost for each path $\omega$, given by $c(\omega) = \sum_{e\in\omega} g(i_q(e))$.

The global strategy $q$ represents the overall distribution of the commuters' choices of paths. Imposing
a Nash equilibrium condition (no single commuter wants to change his choice, provided all the
others keep the same strategy) gives the following condition:

\[ \omega \in C(s,d),\ q_\omega > 0 \ \Longrightarrow\ c(\omega) = \min\{ c(\tilde\omega) : \tilde\omega \in C(s,d) \}. \]

This condition is well-known among geographical economists as Wardrop equilibrium.

The existence of at least one equilibrium comes from the following variational principle.

Optimizing an overall congestion cost means minimizing a quantity $J(q) := \sum_e H(i_q(e))$ (where
$H : \mathbb{R}_+ \to \mathbb{R}_+$ is an increasing function: for instance, if one takes $H(t) = t\,g(t)$, the value of $J(q)$
gives the total cost for all commuters) among all possible strategies $q$.

The minimization of $J$ obviously has a solution, and one can look for optimality conditions.
Suppose that $H$ and $\Gamma$ are convex, so that the necessary conditions will also be sufficient: it is easy
to see that $q$ minimizes $J$ if and only if, for every other admissible $\tilde q$, one has

\[ \sum_e H'(i_q(e))\,\big(i_{\tilde q}(e) - i_q(e)\big) \geq 0. \]

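Set $\xi(e) := H'(i_q(e))$ and rewrite the left-hand side as

\[ \sum_e \xi(e)\big(i_{\tilde q}(e) - i_q(e)\big) = \sum_e \sum_{\omega\ni e} \xi(e)\,(\tilde q_\omega - q_\omega) = \sum_\omega \Big(\sum_{e\in\omega}\xi(e)\Big)(\tilde q_\omega - q_\omega). \]

This says that, if one sets $L_\xi(\omega) := \sum_{e\in\omega}\xi(e)$, the optimal $q$ must minimize $\sum_\omega L_\xi(\omega)\,q_\omega$
among admissible strategies, since

\[ \sum_\omega L_\xi(\omega)\,\tilde q_\omega \geq \sum_\omega L_\xi(\omega)\,q_\omega \qquad \text{for every admissible } \tilde q. \]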

This means two facts. First, since the admissibility conditions on $q$ only look at starting and
arrival points, it is pointless to put some mass on those curves $\omega$ from a source $s$ to a destination
$d$ such that the value $L_\xi(\omega)$ is strictly larger than $d_\xi(s,d) := \min_{\omega\in C(s,d)} L_\xi(\omega)$. This means that
$q_\omega > 0$ and $\omega \in C(s,d)$ imply $L_\xi(\omega) = d_\xi(s,d)$.

Second, another condition occurs when the demand $\gamma$ is not fixed. We said that to minimize
$\sum_\omega L_\xi(\omega)\,q_\omega$ we only use curves where $L_\xi = d_\xi$, and this gives

\[ \sum_\omega L_\xi(\omega)\,q_\omega = \sum_{s,d} d_\xi(s,d) \Big(\sum_{\omega\in C(s,d)} q_\omega\Big) = \sum_{s,d} d_\xi(s,d)\,\gamma(s,d). \]

In particular, one also needs to choose $\gamma \in \Gamma$ so as to minimize

\[ \sum_{s,d} d_\xi(s,d)\,\gamma(s,d), \qquad \gamma \in \Gamma. \]

This second condition is empty if $\Gamma$ only contains one $\gamma$, but it is of particular interest when $\Gamma =
\Pi(\mu,\nu)$, since it says that $\gamma$ must solve a Kantorovitch problem for the cost $d_\xi$.

The first condition, on the other hand, always gives some information on $q$ and says exactly this: if
$q$ is optimal, then it is a Wardrop equilibrium for $g = H'$.
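
The variational principle can be tried out on a toy network. The sketch below is only an illustration under assumptions that are not in the notes: two paths from $s$ to $d$ (path A made of two edges, path B of a single edge), a total demand of 2, the congestion function $g(i) = 1 + i$ and its primitive $H(t) = t + t^2/2$, so that $H' = g$. It minimizes $J$ by ternary search and then compares the costs of the two paths.

```python
DEMAND = 2.0  # hypothetical number of commuters from s to d


def g(i):
    """Congested cost of an edge carrying traffic i (assumed form)."""
    return 1.0 + i


def H(t):
    """Primitive of g, so that H' = g."""
    return t + 0.5 * t * t


def J(q_a):
    """Overall cost sum_e H(i_q(e)): path A has two edges, path B has one."""
    q_b = DEMAND - q_a
    return 2.0 * H(q_a) + H(q_b)


def minimize_convex(f, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


q_a = minimize_convex(J, 0.0, DEMAND)
q_b = DEMAND - q_a
cost_a = 2.0 * g(q_a)  # c(A) = sum of g over the two edges of A
cost_b = g(q_b)        # c(B) = g on the single edge of B
print(q_a, q_b, cost_a, cost_b)
```

Both paths carry some traffic at the minimizer and their costs agree up to numerical tolerance, which is the Wardrop condition for $g = H'$ in this small example.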



1.3 Wardrop equilibria, the continuous case and equivalences 

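It is possible to give a continuous formulation and prove analogous results (see [E]). In a domain
$\Omega \subset \mathbb{R}^d$ the demands are represented by probabilities $\gamma \in \mathcal{P}(\Omega\times\Omega)$. We are given a set
$\Gamma \subset \mathcal{P}(\Omega\times\Omega)$ as the set of admissible demand couplings: usually $\Gamma = \{\gamma\}$ or

\[ \Gamma = \Pi(\mu,\nu) = \{\gamma \in \mathcal{P}(\Omega\times\Omega) : (\pi_x)_\#\gamma = \mu,\ (\pi_y)_\#\gamma = \nu\}. \]

Let us also set

\[ C = \{\text{Lipschitz paths } \omega : [0,1] \to \Omega\}, \qquad C(s,d) = \{\omega\in C : \omega(0) = s,\ \omega(1) = d\}. \]

We look for a probability $Q \in \mathcal{P}(C)$ such that $(\pi_{0,1})_\# Q \in \Gamma$, where $\pi_{0,1}(\omega) := (\omega(0),\omega(1))$.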

We want to define a traffic intensity $i_Q \in \mathcal{M}_+(\Omega)$ such that the quantity $i_Q(A)$ stands for "how
much" the movement takes place in $A$. For $\phi \in C^0(\Omega)$ and $\omega \in C$ set $L_\phi(\omega) = \int_0^1 \phi(\omega(t))\,|\omega'(t)|\,dt$.
Then we define $i_Q$ by

\[ \langle i_Q, \phi \rangle = \int_C L_\phi(\omega)\,Q(d\omega) = \int_C \left( \int_0^1 \phi(\omega(t))\,|\omega'(t)|\,dt \right) Q(d\omega). \]

Notice that this is exactly what happens for the transport density: the traffic intensity $i_Q$ is a
generalization of the transport density, since it deals with the case where $Q$ is any measure on $C$,
while the transport density only looks at the measure concentrated on the segments $[x,y]$ for $(x,y)$
in the support of an optimal $\gamma$.

In this continuous framework, it is more convenient to start from the optimization point of view
(instead of looking at the equilibrium as a starting point): we minimize the convex functional

\[ J(Q) = \begin{cases} \int_\Omega H(i_Q(x))\,dx & \text{if } i_Q \ll \mathcal{L}^d, \\ +\infty & \text{otherwise,} \end{cases} \]

among all admissible strategies $Q$, $H$ being a convex, increasing and superlinear function. Typically
$H(t) = t^p$, or $H(t) = t + t^p$ (which is more reasonable, since in general we have $g(0) = H'(0)$ and
we do not want $g(0) = 0$: this would mean that moving on an empty road costs nothing, which is
usually not the case).
First one should prove finiteness of the minimum, which is not evident, since in the continuous
case one needs to prove the existence of a $Q$ such that $i_Q \in L^p$. In the case of $\mu, \nu \in L^p$, this is a
consequence of the summability results on the transport density, since the transport density is, as
we said, a particular choice for $i_Q$. This is why we explicitly cited the $L^p$ result of De Pascale and
Pratelli (besides its interest in itself).

It is possible to look for optimality conditions and to recover the same two conditions as before:
Wardrop equilibrium and optimization of $d_\xi$. Here $\xi$ will be the metric $\xi(x) = H'(i_Q(x))$. Yet, this
function is neither continuous nor l.s.c., and some effort is needed to give a meaning to the concept of
geodesic distance in the case $\xi \in L^q$.

It is also interesting to notice that this problem looks at the movement of some players whose
individual goal is fixed but whose utility also depends on the density of all the other players (i.e. their
movement is more expensive if they pass where the density is higher): this seems to be a particular
case of the so-called Mean Field Games introduced by Lasry and Lions in [20].

All the results we cited are valid for the case $\Gamma = \{\gamma\}$ as well as for $\Gamma = \Pi(\mu,\nu)$ (all the transport
plans).

Yet, in this second case, something more may be said. Instead of defining a scalar traffic intensity
one can define a vector measure $v_Q$ by

\[ \langle v_Q, \phi \rangle = \int_{C([0,1];\Omega)} \int_0^1 \phi(\omega(t))\cdot\omega'(t)\,dt\,Q(d\omega), \]

i.e. a sort of vector version of $i_Q$. It is immediate to check that $|v_Q| \leq i_Q$, and that

\[ \nabla\cdot v_Q = \mu - \nu, \qquad v_Q\cdot n = 0 \ \text{on } \partial\Omega. \]

Since $H$ is increasing, this implies that the infimum of the previous problem with $i_Q$ is larger than
or equal to that of the minimal flow problem

\[ \min \left\{ \int_\Omega \mathcal{H}(v(x))\,dx \;:\; \nabla\cdot v = \mu - \nu,\ v\cdot n = 0 \ \text{on } \partial\Omega \right\}, \tag{1.1} \]

where $\mathcal{H}(v) := H(|v|)$.

A natural question, arising for instance from a comparison with the Monge case, where looking
for the vector or the scalar transport density was the same, is the possible equivalence of the two
problems.

One can see that a minimizer of the scalar problem can be built formally from a minimizer of
the vector one in the following way: if $v$ is the unique solution of the vector problem and $\mu$ and
$\nu$ are absolutely continuous (so that we will write $\mu$ and $\nu$ for their densities as well), we consider
the non-autonomous Cauchy problem

\[ \begin{cases} \omega'(s) = \hat v(s,\omega(s)), \\ \omega(0) = x, \end{cases} \tag{1.2} \]

for the non-autonomous vector field

\[ \hat v(t,x) = \frac{v(x)}{(1-t)\,\mu(x) + t\,\nu(x)}. \tag{1.3} \]
The latter will not have any Lipschitz continuity property in general, unless the optimizer $v$ of (1.1)
is regular: anyway, if we assume that one can prove that $\hat v$ is Lipschitz, then the flow $X : [0,1]\times\Omega \to \Omega$
of $\hat v$ is well-defined as the solution of (1.2), and we can take $\mu_t$ as the image of $\mu$ through the map
$X(t,\cdot)$. One can see that $\mu_t$ must coincide with the linear interpolating curve $(1-t)\mu + t\nu$ (because
this curve solves the continuity equation thanks to the divergence condition). This yields
$(X(1,\cdot))_\#\mu = \nu$, which ensures that $X(1,\cdot)$ transports $\mu$ onto $\nu$. If we now consider the probability
measure concentrated on the flow, i.e.

\[ Q := \int_\Omega \delta_{\omega_x}\,\mu(dx), \qquad \omega_x := X(\cdot,x), \]

then $Q$ is admissible and it is not difficult to see that $i_Q = |v|$, since

\[ \langle i_Q, \phi \rangle = \int_\Omega \int_0^1 \phi(\omega_x(t))\,|\omega_x'(t)|\,dt\,d\mu = \int_0^1 dt \int_\Omega \phi(\omega_x(t))\,\frac{|v(\omega_x(t))|}{\mu_t(\omega_x(t))}\,d\mu = \int_0^1 dt \int_\Omega \phi\,\frac{|v|}{\mu_t}\,d\mu_t = \int_\Omega \phi\,|v|\,dx. \]

This finally implies that the minima of the two problems coincide. Moreover, this construction
provides a transport map (that is, $X(1,\cdot)$) from $\mu$ to $\nu$, whose transport "rays" evidently do not
cross and which is monotone on transport "rays" (as a consequence of the Cauchy-Lipschitz Theorem).

Notice that, to prove rigorously what we stated, one should investigate a little bit
the regularity of the optimal $v$. This may be done by writing optimality conditions for $v$, which give
$v = \nabla\mathcal{H}^*(\nabla u)$, where $u$ solves

\[ \begin{cases} \nabla\cdot\nabla\mathcal{H}^*(\nabla u) = \mu - \nu & \text{in } \Omega, \\ \nabla\mathcal{H}^*(\nabla u)\cdot n = 0 & \text{on } \partial\Omega. \end{cases} \tag{1.4} \]

For $H(t) = t^2$ this is a simple Laplace equation and regularity theory is well-known. For $H(t) = t^p$
this gives a $p'$-Laplace equation, and here as well many studies have been done. Yet, for modeling
reasons, we said that it is important to look at the case $H'(0) > 0$, and we suggested as a typical
case

\[ \mathcal{H}(v) = \frac{1}{p}|v|^p + |v|, \qquad v \in \mathbb{R}^N, \tag{1.5} \]

which leads to a function $\mathcal{H}^*$ which vanishes on $B_1$. In particular, the corresponding equation for $u$
is very degenerate and regularity results are less studied (see [7], both for the equivalence with
the Wardrop problem and for some regularity proofs).
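
The claim that $\mathcal{H}^*$ vanishes on $B_1$ can be checked by a direct computation of the Legendre transform of (1.5). The following short derivation is added as an illustration; it assumes $p > 1$, writes $p' = p/(p-1)$ for the conjugate exponent and $(s)_+ = \max\{s, 0\}$, and uses that the supremum is attained for $v$ parallel to $w$.

```latex
\begin{align*}
\mathcal{H}^*(w)
  &= \sup_{v\in\mathbb{R}^N} \Big\{ w\cdot v - \tfrac{1}{p}|v|^p - |v| \Big\}
   = \sup_{t\ge 0} \Big\{ t\,(|w|-1) - \tfrac{1}{p}\,t^p \Big\} \\
  &= \tfrac{1}{p'}\,\big(|w|-1\big)_+^{p'},
\qquad\text{hence}\qquad
\nabla\mathcal{H}^*(w) = \big(|w|-1\big)_+^{p'-1}\,\frac{w}{|w|}.
\end{align*}
```

In particular $\nabla\mathcal{H}^*(\nabla u) = 0$ wherever $|\nabla u| \leq 1$, which is where the equation for $u$ loses all ellipticity.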



2 The urban planning of residents and services 

A very simplified model that has been proposed for studying the distribution of residents and services
in a given urban region $\Omega$ passes through the minimization of a total quantity $\mathcal{F}(\mu,\nu)$ concerning
two unknown densities $\mu$ and $\nu$.

The two measures $\mu$ and $\nu$ will be searched among probabilities on $\Omega$. This means that the total
amounts of population and production are fixed as problem data. The definition of the total cost
functional to optimize takes into account some criteria we want the two densities $\mu$ and $\nu$ to satisfy:
(i) there is a transportation cost $T$ for moving from the residential areas to the services areas;

(ii) people do not want to live in areas where the density of population is too high; 

(iii) services need to be concentrated as much as possible in order to increase efficiency and decrease 
management costs. 

Fact (i) is described, in its easiest version, through a $p$-Wasserstein distance ($p \geq 1$). We will
look at $T(\mu,\nu) = W_p^p(\mu,\nu)$.

Fact (ii) will be described by a penalization functional, a kind of total unhappiness of citizens
due to high density of population, obtained by integrating with respect to the citizens' density their
personal unhappiness.

Fact (iii) is modeled by a third term representing costs for managing services once they are
located according to the distribution $\nu$, taking into account that efficiency depends strongly on how
much $\nu$ is concentrated.

The cost functional to be considered is then

\[ \mathcal{F}(\mu,\nu) = T(\mu,\nu) + F(\mu) + G(\nu), \tag{2.1} \]

where $F, G : \mathcal{P}(\Omega) \to [0,+\infty]$ are functionals chosen so that the first one favors spread measures
and the second one concentrated measures, in suitable senses.
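
In dimension one the transport term in (2.1) can be evaluated by hand for discrete measures: for the cost $|x-y|^p$ with $p \geq 1$, the monotone (sorted) coupling between two families of equally weighted atoms is optimal. The sketch below relies on this one-dimensional fact only; the function name and the sample data are hypothetical and are not taken from the notes.

```python
def wasserstein_pp_1d(xs, ys, p=2.0):
    """W_p^p between two empirical measures on the real line, each made of
    len(xs) = len(ys) atoms with equal weights (monotone coupling, p >= 1)."""
    if len(xs) != len(ys):
        raise ValueError("both samples must have the same number of atoms")
    n = len(xs)
    return sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys))) / n


# Hypothetical data: residents spread on [0, 3], all services at one pole.
residents = [0.0, 1.0, 2.0, 3.0]
services = [1.5, 1.5, 1.5, 1.5]
cost = wasserstein_pp_1d(residents, services, p=2.0)
print(cost)
```

Moving the single service pole away from the middle of the residents increases this term, which is the competition between transport cost and concentration that the functional (2.1) encodes.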

We stress that this model is a very naive one, since it disregards equilibrium issues and several
other parameters, and that it could be applied only in those cases where a planner could control the
whole behavior of the region. We refer to [B], [9], [10], [23], [24] for the study of this model and of similar
ones.

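As far as particular choices for the functionals $F$ and $G$ are concerned, we may consider

\[ F(\mu) = \begin{cases} \int_\Omega f(u)\,d\mathcal{L}^d & \text{if } \mu = u\cdot\mathcal{L}^d, \\ +\infty & \text{otherwise,} \end{cases} \qquad G(\nu) = \begin{cases} \sum_k g(a_k) & \text{if } \nu = \sum_k a_k\,\delta_{x_k}, \\ +\infty & \text{otherwise,} \end{cases} \]

where the integrand $f : [0,+\infty] \to [0,+\infty]$ is assumed to be lower semicontinuous and convex, with
$f(0) = 0$ and superlinear at infinity, that is,

\[ \lim_{t\to+\infty} \frac{f(t)}{t} = +\infty, \]

and the function $g$ is required to be subadditive, lower semicontinuous, and such that

\[ g(0) = 0 \quad\text{and}\quad \lim_{t\to 0^+} \frac{g(t)}{t} = +\infty. \]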

In this form we have two local lower semicontinuous functionals on measures (see [5]; a functional
on measures is said to be local if it is additive on mutually singular measures). This is a useful class
of functionals over measures, including both concentration-preferring functionals and functionals
favoring spread measures.

Without loss of generality, by subtracting constants from the functional $F$, we can suppose $f'(0) =
0$. Due to the assumption $f(0) = 0$, the ratio $f(t)/t$ is an incremental ratio of the convex function
$f$ and thus it is increasing in $t$. Then, if we write the functional $F$ as

\[ F(\mu) = \int_\Omega \frac{f(u(x))}{u(x)}\,d\mu(x), \]

we can see the quantity $f(u)/u$, which is increasing in $u$, as the unhappiness of a single citizen when
he lives in a place where the population density is $u$. Integrating it with respect to $\mu = u\cdot\mathcal{L}^d$ gives
a quantity to be seen as the total unhappiness of the population.

Concerning $G$, we can think that we are requiring $\nu$ to be concentrated on a limited number of
service poles and that the effects of the managing costs and of the production of a pole whose size
is $a$ are summarized in a cost function $g(a)$.

For $G$, there are other interesting choices among functionals which favor concentration. One of
them could be

\[ G(\nu) = \int_{\Omega\times\Omega} h(|x-y|)\,\nu(dx)\,\nu(dy), \]

where $h$ is an increasing function and $h(|x-y|)$ stands for the cost of managing the interactions
between services located at $x$ and at $y$. This new choice for $G$ is more concerned with the positions
of the services, and not only with the size of each pole.

These two choices and other possible models give different interesting results when one looks at
the minimizers. In the first case several atoms occur in $\nu$, and $\mu$ is concentrated on balls around
these centers, which correspond to sub-cities; in the other a single-center city is obtained. The
mathematical properties which are obtainable thanks to what we know from the theory of optimal
transport are remarkable.

As a simple example, we will mention that the solutions of

\[ (P_\nu) \qquad \min_\mu\ \mathcal{F}(\mu,\nu) \]

for fixed $\nu \in \mathcal{P}(\Omega)$ are characterized by

\[ \mu = u\cdot\mathcal{L}^d; \qquad u = (f')^{-1}\big((\text{const} - \psi_{\mu,\nu})_+\big), \]

where $\psi_{\mu,\nu}$ is a Kantorovitch potential for the transport from $\mu$ to $\nu$ and the cost $c(x,y) = |x-y|^p$.

Moreover, in the whole minimization with respect to $\mu$ and $\nu$, in the particular case $T(\mu,\nu) =
W_2^2(\mu,\nu)$, $G(\nu) = \lambda\int_{\Omega\times\Omega} |x-y|^2\,\nu(dx)\,\nu(dy)$ and $F(\mu) = \|\mu\|^2_{L^2}$, any pair of minimizers $(\mu,\nu)$ is
shaped as follows:

• $\mu$ is concentrated on a ball $B(x_0, r_\lambda)$ (intersected with $\Omega$) and has a density $u$ given by

\[ u(x) = \frac{\lambda}{2\lambda+1}\,\big(r_\lambda^2 - |x-x_0|^2\big); \]

• $\nu$ is concentrated on the ball $B(x_0, r_\lambda/(2\lambda+1))$ and it is the image of $\mu$ under the homothety
of ratio $(2\lambda+1)^{-1}$ and centre $x_0$;

• $x_0$ is the barycentre of both $\mu$ and $\nu$.

The main tool for all these results is the following computation: if $\mu_\varepsilon = (1-\varepsilon)\mu + \varepsilon\mu_1$, then

\[ \lim_{\varepsilon\to 0} \frac{W_p^p(\mu_\varepsilon,\nu) - W_p^p(\mu,\nu)}{\varepsilon} = \int \psi_{\mu,\nu}\, d(\mu_1 - \mu), \]

where $\psi_{\mu,\nu}$ is, again, a Kantorovitch potential for the transport from $\mu$ to $\nu$ and the cost $|x-y|^p$.
This formula says that the Kantorovitch potentials stand for Gateaux derivatives of the functional
$W_p^p(\cdot,\nu)$. Thanks to standard convex analysis, it is not difficult to guess it, and to apply it to
variational problems, if one thinks that the duality formula $W_p^p(\mu,\nu) = \sup \int\psi\,d\mu + \int\psi^c\,d\nu$ exactly
says that this functional is convex and the optimal $\psi$ are the elements of its subdifferential.

Notice that this kind of technique for finding optimality conditions of problems such as $(P_\nu)$ is
useful in other contexts as well, and in particular, for $p = 2$, when gradient flows of functionals on
the Wasserstein space $\mathbb{W}_2$ are concerned. Actually, the standard minimizing movement procedure
for the gradient flow of a functional $F$ passes through discrete minimization steps for quantities like

\[ F(\mu) + \frac{W_2^2(\mu,\mu_k)}{2\tau}, \]

where $\tau$ is a time step and a discrete sequence $(\mu_k)_k$ is built taking for $\mu_{k+1}$ the solution of $(P_{\mu_k})$.

We finish the section by saying that other models with different costs, for instance when $T$ is
no longer a Wasserstein distance but comes from a congested or branched transport problem (see
Section 1), have been investigated as well (see [15]).



3 Equilibrium structure of a city 

This second part is devoted to a much more detailed model of the structure of a city, which looks at
an equilibrium configuration for the behavior of residents, firms and landowners. It has a much
more economic flavor and it has been studied by G. Carlier and I. Ekeland in [11], [12].
The elements in this description of the city are the following:

• a domain $\Omega \subset \mathbb{R}^d$ which stands for the urban region we consider;

• a measure $\mu = N(x)\,dx$ on $\Omega$ standing for the residents: this is unknown, as well as its mass;

• a measure $\nu = n(x)\,dx$ standing for jobs, which is unknown too;

• a transportation cost $c(x,y)$ for commuting inside $\Omega$: this is given;

• a wage function $\psi : \Omega \to \mathbb{R}$, where $\psi(x)$ stands for the salary that workers employed by the
firm located at $x$ receive from the firm (this is an unknown of the problem);

• a revenue function $\phi : \Omega \to \mathbb{R}$ (unknown) standing for the revenues net of commuting cost
that residents earn: people living at $x$ will solve $\max_y \psi(y) - c(x,y) =: \phi(x)$ so as to choose
where to work and, conversely, firms located at $y$ will
decide whom to hire by solving $\min_x \phi(x) + c(x,y)$, getting again $\psi(y)$ as a (minimal) wage
to be assured at $y$ so that there are workers who do accept to work at $y$;

• a common utility function $U(C,S)$ for all the residents, depending on their consumption level $C$
and on the quantity $S$ of land they use, as well as a fixed utility level $u$ that every agent
wants to realize: these are given as exogenous (fixed), and $u$ may be thought of as the utility level
outside $\Omega$, i.e. the utility realized if one decides to move out of the city;

• a price for residential rent $Q(x)$: at every point $x$ the residents want to choose a consumption
$C$ and a land surface $S$ so that they obtain at least the utility $u$, i.e., if $Q(x)$ is known, they
solve $\min\{C + Q(x)S : U(C,S) \geq u\}$ and they get the minimal amount of money they need.
At the equilibrium this amount will necessarily be $\phi(x)$ (i.e. the money they actually have).
This gives a relation between $Q$ and $\phi$ and finds the optimal value $S(x)$ as well. One obviously
has $N(x) = 1/S(x)$;

• a productivity function $z : \Omega \to \mathbb{R}$ which is supposed to depend increasingly on $\nu$ (say,
$z(x) = \nu(B(x,r))$, or $z$ is obtained through a more general convolution of $\nu$: the idea is that
the productivity is higher where there is a higher concentration of workers);

• a production $f(z,n)$ which gives the output of a firm employing $n$ workers in a zone where
the productivity is $z$;

• a price for industrial rent $q(x)$ which is obtained, at the equilibrium, by imposing that all the
surplus of the firm may be absorbed by the landlord, so that $q(x) = \max_n f(z(x),n) - \psi(x)n$.
This also gives the optimal value $n(x)$.

An equilibrium is given by a pair of measures $(\mu,\nu)$, some continuous functions $z, \phi, \psi$ and a
transport plan $\gamma \in \Pi(\mu,\nu)$ such that

• $\mu$ and $\nu$ have the same mass;

• $\gamma$ and the pair $(\phi,\psi)$ are compatible, in the sense that $\gamma$ is concentrated on the set $\{(x,y) :
\phi(x) = \psi(y) - c(x,y)\}$ and the inequality $\phi(x) \geq \psi(y) - c(x,y)$ holds for every $(x,y)$;

• $z$ is obtained from $\nu$ through the productivity relation that we mentioned above;

• once $Q$ and $q$ are computed (depending on $\phi$, $\psi$ and $z$), one finds the optimal $N$ and $n$ to be
equal to the densities of $\mu$ and $\nu$, respectively;

• $\mu$ is concentrated on $\{Q \geq q\}$ and $\nu$ on $\{q \geq Q\}$ (this depends on the landlords' behavior: they
would not rent to residents if renting to firms is more profitable, nor vice versa).
The application of optimal transport theory is straightforward, and it allows one to pose the
problem as a fixed-point issue on $\mu$ and $\nu$: once $\mu$ and $\nu$ are given, one only needs to take for
$\gamma$ the optimal transport plan for the cost $c$ and for $\phi$ and $\psi$ the Kantorovitch potentials. This
is what Carlier and Ekeland did, proving well-posedness results in a framework which was much
more general than what was studied before in the literature (mainly one-dimensional or radially
symmetric cases).

4 Application to Economics: Kantorovitch potential as prices or 
utilities 

4.1 Hotelling 

The Hotelling problem is a two-step equilibrium problem for the strategic location of $N$ firms
trying to maximize their incomes from a given distribution $\mu$ of consumers in a domain $\Omega$, according
to the following criterion. Notice that the domain may be interpreted in a geographical way, or
may represent the different features of the goods the firms sell.

If we know the positions $x_i$ of the firms and the prices $p_i$ that they choose, the consumer located
at $x$ will choose where to buy his good by minimizing the sum $c(x,x_i) + p_i$ over $i = 1,\dots,N$ (the
cost $c(x,y)$ representing for instance the distance from $x$ to $y$, or taking into account the utility that
$x$ has when he buys a product of type $y$). In this way some influence regions

\[ A_i = \{x : x_i \text{ minimizes } c(x,x_i) + p_i\} \]

and some demands $d_i = \mu(A_i)$ are obtained. Every firm wants to maximize the profit $p_i d_i$, and a
Nash Equilibrium configuration for prices is a choice of the $N$ prices so that no firm wants to change
its mind (i.e. to change its price $p_i$, supposing that all the others do not change their own prices).
Supposing that, for every configuration of the positions of the firms, there is a unique equilibrium,
every firm knows the function associating the profits to positions. An equilibrium configuration is
hence a configuration where no firm wants to move in order to enhance its profit, provided the others
do not move (once again, a Nash Equilibrium). The Hotelling problem looks exactly at finding such
an equilibrium (see [21]).
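
The first step of the construction (influence regions and demands for given positions and prices) is easy to reproduce numerically. The following sketch assumes a discrete distribution of consumers on a segment and the cost $c(x,y) = |x-y|$; the function `demands`, the tie-breaking rule (ties go to the firm with the lowest index) and the data are hypothetical choices made for the example, and no equilibrium is computed.

```python
def demands(consumers, weights, firms, prices):
    """Return the list of d_i = mu(A_i) for a discrete measure mu given by
    atoms `consumers` with masses `weights`, and cost c(x, y) = |x - y|."""
    d = [0.0] * len(firms)
    for x, w in zip(consumers, weights):
        totals = [abs(x - xi) + pi for xi, pi in zip(firms, prices)]
        best = min(range(len(firms)), key=lambda i: totals[i])  # ties: lowest index
        d[best] += w
    return d


consumers = [0.0, 0.25, 0.5, 0.75, 1.0]
weights = [0.2] * 5          # mu is a probability measure
firms = [0.0, 1.0]           # positions x_1, x_2

equal_prices = demands(consumers, weights, firms, [1.0, 1.0])
cheaper_second = demands(consumers, weights, firms, [1.0, 0.4])
profits = [p * d for p, d in zip([1.0, 0.4], cheaper_second)]
print(equal_prices, cheaper_second, profits)
```

Lowering the second price enlarges the second influence region, but the profit $p_i d_i$ of that firm does not necessarily increase, which is the trade-off behind the price equilibrium.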

An easy but interesting link with optimal transport is the following, and concerns the first step
(i.e. price equilibria). The idea is: instead of taking the prices $p_i$, look at the demands $d_i$. It
will be possible to reconstruct the $p_i$ from the $d_i$: in order to do that, just consider the measure
$\nu = \sum_{i=1}^N d_i\,\delta_{x_i}$ and prove that the function $p : \{x_1,\dots,x_N\} \to \mathbb{R}$ is a Kantorovitch potential for
the transport from $\nu$ to $\mu$ for the cost $c$. Once this is done, the problem may be translated into
a condition on $\nu$ which involves its Kantorovitch potential. Notice that writing down the precise
conditions on $\nu$ involves understanding how the Kantorovitch potential depends on $\nu$, which is a
very delicate issue that we will meet again.

4.2 Rochet-Chone 

There are different models for the prices that a monopolist firm may impose on the goods it produces.
One of the mathematically most interesting is the Rochet-Chone model (see [22]), which is an
optimisation problem under convexity constraint. The convex structure comes from the simplifying
assumption that the space of goods $y$ and the space of consumers $x$ are subsets of $\mathbb{R}^n$ and they are
coupled through the function $(x,y) \mapsto x\cdot y$ representing the utility that a consumer of type $x$ has
in buying $y$. Once the distribution $\mu$ of consumers is known, the firm may choose the price for its
goods, i.e. a function $p : Y \to [0,\infty[$, defined on the goods space $Y$; then every consumer $x$ chooses
what to buy by solving

\[ \max_y\ x\cdot y - p(y) \]

and getting a utility $u(x) := \max_y\, x\cdot y - p(y)$, realized by a good $y_x$. The firm may reconstruct its
total gain by integrating $p(y(x)) - C(y(x))$ (if $C(y)$ is the cost for producing $y$). The total profit
is hence given by



\[ \int_X \big(p(y(x)) - C(y(x))\big)\,\mu(dx) = \int_X \big(x\cdot y(x) - u(x) - C(y(x))\big)\,\mu(dx). \]

One can also notice that $y_x = \nabla u(x)$ (differentiating the expression of $u$), and hence the maximization
of the profit is a problem that may be stated in terms of $u$:

\[ \max F(u) = \int_X \big(x\cdot\nabla u(x) - u(x) - C(\nabla u(x))\big)\,\mu(dx), \]

where the constraints on $u$ are convexity (from its definition), positivity ($u \geq 0$ is a consequence
of the fact that consumers do not buy if they get a negative utility: it may be stated by saying that a
certain "empty" good called $0$ belongs to $Y$ and that we impose $p(0) = 0$; the firm is not allowed to
charge for buying this empty good, but this good interests nobody) and a constraint on the gradient:
$\nabla u \in Y$. This is the optimization problem under convexity constraint we referred to. It falls into
the framework of the convexity-constrained problems studied for instance by Carlier and Lachand-Robert
(see [H]), where some $C^1$ regularity results are also proven. The same class of problems
also includes the well-known Newton Problem of minimal resistance. For both problems, some
numerical insights in particular cases exist, but a lot of information is still lacking.
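
The consumer's side of the model can be made concrete in a toy case. The sketch below assumes a one-dimensional space of types, a finite list of goods containing the "empty" good $0$, and a hypothetical tariff $p(y) = y^2$ (so that $p(0) = 0$); none of these choices comes from the notes. It computes $u(x) = \max_y\, x\,y - p(y)$ together with the chosen good $y_x$.

```python
def price(y):
    """Hypothetical tariff with p(0) = 0."""
    return y * y


def consumer_choice(x, goods):
    """Return (u(x), y_x), where u(x) = max over goods of x*y - p(y).
    In case of ties, the first maximizer in the list is returned."""
    best = max(goods, key=lambda y: x * y - price(y))
    return x * best - price(best), best


goods = [0.0, 0.5, 1.0]             # the empty good 0 belongs to Y
types = [0.0, 0.5, 1.0, 1.5, 2.0]   # consumer types x on a uniform grid

table = [(x,) + consumer_choice(x, goods) for x in types]
utilities = [u for _, u, _ in table]
choices = [y for _, _, y in table]
print(table)
```

On this grid the utilities are nonnegative and have nondecreasing increments, and the chosen goods are nondecreasing in $x$: this is the discrete counterpart of the convexity of $u$ and of the relation $y_x = \nabla u(x)$.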

An interesting change of variable, using the image measure $\nu = (\nabla u)_\#\mu$, is possible, since every
measure is the image of $\mu$ through the gradient of a convex function (which is exactly the well-known
result by Brenier in transport theory, see [8]). It is interesting to link this reformulation to optimal
transport. The most natural cost to be considered would be the scalar product, but we know that
considering $-x\cdot y$ or $\frac12|x-y|^2$ is the same. Hence, we may rewrite the previous problem as

\[ \min_{u \text{ convex}} \tilde F(u) = \int_X \left( \frac{|x|^2}{2} - x\cdot\nabla u(x) + \frac{|\nabla u(x)|^2}{2} + u(x) + \tilde C(\nabla u(x)) \right) \mu(dx), \]

where $\tilde C(z) = C(z) - |z|^2/2$, and we are allowed to add the term in $|x|^2/2$ since it does not depend
on $u$.

We can in the end rewrite the problem in terms of $\nu$ as

\[ \min_{\nu\in\mathcal{P}(Y)} G(\nu) = \frac12 W_2^2(\mu,\nu) + \int_Y \tilde C\,d\nu + \int_X u_\nu\,d\mu, \]

$u_\nu$ being, for a measure $\nu$, the unique convex function satisfying $(\nabla u)_\#\mu = \nu$ and $\min u = 0$ (which is
obtained as a Kantorovitch potential for the cost $-x\cdot y$ or $\frac12|x-y|^2$).

This kind of functional may be studied via transport theory. Existence of a minimizer
is easy, and the interesting point is finding optimality conditions. The difficult part is handling the
term

\[ \nu \mapsto \int_X u_\nu\,d\mu. \]

For getting optimality conditions, it would be useful to differentiate this term with respect to
variations of $\nu$. Yet, computing

\[ \lim_{\varepsilon\to 0} \frac{u_{\nu_\varepsilon} - u_\nu}{\varepsilon}, \qquad \nu_\varepsilon = \nu + \varepsilon(\tilde\nu - \nu), \]

is a challenging issue; possible strategies include the linearisation of the Monge-Ampere equation,
but many questions are open.